From HIV infection to AIDS: 
A dynamically induced percolation transition? 
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The origin of the unusual incubation period distribution in the development of AIDS is largely 
unresolved. A key factor in understanding the observed distribution of latency periods, as well 
as the occurrence of infected individuals not developing AIDS at all, is the dynamics of the long 
lasting struggle between HIV and the immune system. Using a computer simulation, we study the 
diversification of viral genomes under mutation and the selective pressure of the immune system. 
In common infections vast spreading of viral genomes usually does not takes place. In the case of 
an HIV infection this may occur, as the virus successively weakens the immune system by depletion 
of CD4+ cells. In a sequence space framework, this leads to a dynamically induced percolation 
transition, corresponding to the onset of AIDS. As a result, we obtain the prolongated shape of the 
incubation period distribution, as well as a finite fraction of non-progressors that do not develop 
AIDS, comparing well with results from recent clinical research. 



It is a well known empirical fact that incubation times 
of most diseases obey a lognormal distribution, only vary- 
ing in their distributions' mean and dispersion factors. 
This has been verified for many single-exposure, common 
vehicle epidemics and is often referred to as "Sartwell's 
model" [1,2]. Vice versa, this allows for estimates of ex- 
posure types as well as etiology of disease from an ob- 
served incubation period distribution [3-5]. Recently, 
the underlying dynamics that generate the incubation 
period distribution, as well as mechanisms that lead to 
deviations from the common distribution, have gained 
attention [6-8]. 

One of the most prominent examples for a deviation from 
the lognormal case is the distribution of waiting times 
between HIV infection (seroconversion) and the onset 
of AIDS, which is supported by data sets from various 
studies [9-14]. The divergence from lognormality, ex- 
traordinarily long incubation times, and the occurrence 
of non-progressors (patients not developing AIDS) sug- 
gest a more complex generating dynamics than observed 
in other infectious diseases. While much effort has been 
spent on parametric estimates of the incubation period 
distribution [15], we here ask which are possible mech- 
anisms of the underlying dynamics. Any such attempt 
has to take into account the HIV-specific negative feed- 
back to the host's immune system. While the immune 
system develops an epitope-specific answer to HIV, as it 
does to any other antigenic invasion, it is weakened by 
HIV in a way that is not common to other viral infec- 
tions. HIV targets the replicative machinery of CZ)4+ 
cells which are depleted when viruses proliferate. CZ)4+ 
cells as T helper cells are essential actors within an im- 
mune response. Therefore HIV is able to globally weaken 
the host's resistance against antigens. 
In earlier approaches, the onset of AIDS has been associ- 
ated with the passage of an antigenic diversity threshold 
in the framework of differential equation models [16,17]. 
More recently, progress has been made to overcome the 
limitations of analytical models with respect to topologi- 



cal effects in the shape space of receptors and in physical 
space. Cellular automaton models have been defined and 
investigated that show the typical separation between the 
time scales of primary infection and the onset of AIDS 
[18,19]. 

In this article we take an alternative approach and com- 
bine cellular automata with a sequence space frame- 
work in order to model typical characteristics of the time 
course of HIV infection. In the following sections we will 
first define a framework to represent ordinary infections 
within the scope of percolation theory. From there we 
will extend the model to describe the special case of HIV 
infection and discuss the distribution of incubation peri- 
ods. Numerical simulations will be complemented by a 
stochastic model for the origin of the variety in incuba- 
tion period distributions. Finally we discuss our findings 
in the context of empirical data on HIV survival. 

I. PERCOLATION MODEL OF INFECTION 

Along the course of an infection one generally observes 
a diversification of viral genomes due to mutation and 
the selective pressure of the immune system. This co- 
evolutionary dynamics can be modeled within a sequence 
space framework [20]. Representing viral genomes by 
strings of length n, built up from an alphabet of length 
A, we can describe their diversification as spread in se- 
quence space. Analogously, let us assign a sequence to 
the respective immune receptor matching the viral strain. 
Any string in sequence space is assumed to represent a 
viral epitope, as well as its complementary immune re- 
ceptor. Thus each sequence is characterized by a viral 
and an immunological state variable. A mathematical 
framework to describe the dynamics in such a space can 
be found in percolation theory [21] and in theories for 
epidemic spreading, i.e. SIR models [22,23]. However, 
while those models apply cellular automata to the in- 
teraction of organisms, we here apply the mathematical 
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concept to modeling the populations of immune cells and 
viruses within one organism. Adopting the notation of 
SIR models, we call a site in sequence space susceptible, 
if it in principle can harbor a virus. It is denoted as 
infected, if the system contains a virus with an epitope 
motif represented by the site's string. If a viral sequence 
meets immune response it is removed and the system is 
immunized against it. In this case and in case a site in 
principle is inaccessible for a virus, it is called recovered 
(or removed). Aside from this, two immunological states 
are distinguished. An immune receptor shape may or 
may not be present within the immune repertoire. We 
set up a system in which a site is inaccessible for viral se- 
quences with probability Dq accounting for the fact that 
the viral genome is not arbitrary. In addition we intro- 
duce a probability of immunological presence at any site 
in sequence space pis{t) with pis(O) = po- This means 
that for sufficiently large systems the initial density of 
recovered sites is i?(0) = Dq + Pq ~ DqPq. Taking also 
into account the densities of susceptible sites S{t) and of 
infected sites Pv{t) one obtains the relation 



S{t) + pyit) + Rit) = l \/t. 



(1) 



As replication of viral and immunological entities is af- 
flicted with copy fidelities 5^ < 1 and < 1, the system 
shows viral - and in response immunological - spread in 
sequence space. Introducing some viral strains into a so 
far unaffected system leads to a dynamics that is mod- 
eled within the cellular automaton approach by iterating 
the following steps: 

1. Choose a random site. 

2. If the site represents an active immune receptor 

(a) mutate any bit with probability 1 — qis 

(b) if a new immunological strain is generated and 
the mutant matches an infected site reset the 
site's viral status to recovered and assign the 
immunological state to be positive. 

3. If the site is infected 

(a) mutate any bit with probability 1 — Qv 

(b) if a new strain is generated and corresponds 
to a siisceptible site the site gets infected. 

A viral strain generates a specific mutant strain at Ham- 
ming distance d (which is the number of differing bits) 

with probability — which will survive as long 

as it meets a susceptible site. Equally an immunological 
mutant strain is originated with probability — 
under the condition that it coincides with an infected site. 
Otherwise we assume that the immunological mutant is 
not sufficiently amplified to establish a new strain. 



Such a system shows two regimes of qualitatively dif- 
ferent behavior. Below a percolation threshold depend- 
ing on the above parameters the source of infection will 
stay negligible in size compared to the system size, such 
that R{oo) = i?(0) in the limit of infinite system size. 
Above the percolation threshold a virus will spread all 
over the system before it gets defeated. Accordingly 
R{oo) > R{0). 

To determine the threshold conditions within a mean 

field approach ("fully mixed" approximation), we intro- 
duce the following system of differential equations 



dS_ 
dt 

dpv 
dt 

dpis 
dt 
dR 

It 



d=l ^ ^ 
= -{l-q^)p,S 



(A - l)'^ 



= -(1 - iDpisPv + (1 - q'i)Spv 

= (1 - <lis)PvPis 

= (1 - (iDPtsPv 



(2) 
(3) 

(4) 

(5) 



supplemented by the boundary conditions: 

5(0) « 1 - R{Q) 
P.(0) «0 
Pis (0) = po 
i?(0) = i:)o + Po--Dopo• 
With (1 - g,"Jpi«(i) > for ah t and pis{t) = R{t) -Do + 
DqPq one can derive a relation between S{t) and R{t) 



dS 
Itt 



i n dR 

-*- Iv dt 

l-qlR-Do + DoPo' 



which yields 

5(i)=(l -Dq-po + DoPo) 



Po 



R{t) -Do + DoPo 



'(6) 



taking into account the above boundary conditions. To 
evaluate conditions for the percolation threshold we can 
utilize 

-Roo = i?(oo) = 1 - 5(00) =1-500 

because in the stationary state any infected site will re- 
cover. This leads to the following relation for i?oo 



-Roo=l - {I- Do- po + DoPo) 

=/(i?oo). 



Po 



i?oo - Do + DoPo 



(7) 



It is fulfilled for R^c = Dq + po — DoPo = R{0) which 
means that no virus enters the system or at least cannot 
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gain macroscopic areas in sequence space. However, the 
above equation has another solution if 

because /(i?(0)) = R{0), \imR^_,^ f{R^) = 1, 
.f{Roo) < 1 Vi?oc < oo. Evaluating the above condition 
leads to the result that an invading virus can percolate 
in sequence space if 

i-C>i-i?(o)- 

It is worthwhile to consider the case Qig 1 which im- 
plies that the above inequality holds for any R(0) < 1. 
If the immune cells have vanishing mutation rates and 
accordingly lack adaptability, the virus percolates the se- 
quence space in any case - unless immune cells occupy 
any site in sequence space. 

For — 1 we can explicitly determine the asymp- 

totic density of recovered cells from the fixed point equa- 
tion (7) that is solved by i?oo = 1 ^ Po and R^o = 
Do + po — DqPo = R{0). With the additional constraint 
that i?oo > -R(O) one can see that Roo decays linearly 
with increasing po until it is equal to i?(0) and the sub- 
critical regime is reached. This is confirmed by computer 
simulations with various sets of parameters. For the ex- 
ample of Do = 0.5, qv = Qis = 0.95, n = 15, A = 2 this 
leads to a critical immunological density = 0.32 (the 
theoretical value from equation (8) is = |). 
Obviously in common infections the system is below the 
percolation threshold as an adequate immune response 
can defeat a viral attack before strains spread all over 
sequence space. Nonetheless it is not unreasonable to 
assume that the immune system operates near the per- 
colation threshold as unnecessarily high immune receptor 
densities po involve competitive disadvantages. 

II. PERCOLATION TRANSITION FROM HIV 
INFECTION TO THE ONSET OF AIDS 

We are now in the position to extend our model to 
include HIV dynamics. An HIV model has to take care 
of characteristic peculiarities of HIV infections, i.e. the 
destruction of the immune system by the virus. We con- 
sider this by extending the algorithm of section I by the 
following rule: At any iteration step each viral strain is 
given a chance to meet a random immunological clone 
with probability pis (t) which thereafter is destroyed with 
probability p. If the affected site in principle is acces- 
sible for a viral strain the viral status changes back to 
susceptible. We initialize the system near, but below 
the percolation threshold, which is the natural state of 
a healthy immune system. As the system's qualitative 
behavior shows to be insensitive to the specific choice of 



parameters we will in the following choose the parameter 
settings: Do = 0.5, po = 0.325, = qis = 0.95, n = 15, 
A = 2. 
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FIG. 1. Density of viral strains Pv{t) in sequence space 
under evolution of the system (Do = 0.5, po = 0.325, 
q„ = = 0.95, n = 15, A = 2, p = 0.0001). 



0.35 



Pis{t) 




0.1 - 
0.05 - 

I 1 1 1 1 1 1 

500 1000 1500 2000 2500 3000 3500 

t [1000 Iterations] 
FIG. 2. Density of immunologically active sites Pis{t) in 
sequence space (Do = 0.5, po = 0.325, Qv = qis = 0.95, 
n = 15, A = 2, p = 0.0001). Note the analogy to the decline 
in CD4'^ cells under HIV infection. 

Figures 1 and 2 show simulation results for p = 0.0001 
exhibiting characteristics typical to the course of disease 
from HIV infection to the onset of AIDS. One observes a 
drift of viral epitopes due to immune pressure as found 
in HIV-infected individuals [24-30]. Moreover, the sim- 
ulations show fluctuations in the total number of actual 
strains, eventually sharply increasing which corresponds 
to the onset of AIDS [16]. In parallel it is an empirical 
fact that the disease progresses with a depletion of CD4+ 
cells [31,32,25,33] which can be assumed to be accompa- 
nied by a loss in diversity of the immune repertoire as 
shown in figure 2. In this picture the immune system is 
successively weakened while fighting the viral attack and 
ultimately breaks down when the virus begins to perco- 
late in sequence space. The virus dynamically drives the 
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system from a subcritical regime above the percolation 
threshold. 

It will be interesting to investigate the distribution of 

waiting times until percolation among systems only dif- 
fering in their random initialization, which corresponds 
to the incubation period distribution. 
To understand the generated distribution from a theoret- 
ical point of view we have to take care of the stochastic 
nature of py as seen in figure 1. We assume that p„ has 
a time dependent growth rate r(t) that is superposed 
by noise and accordingly follows a generalized geometric 
Brownian motion (cp. appendix). This process Pv has a 
lower absorbing boundary for Pv{t) = 2~^^ and converts 
into exponential growth after having passed an upper 
point of no return p%. The first passage time distribution 
with respect to the upper boimdary corresponds to the 
incubation period distributions under investigation. It 
is derived in the appendix and will be discussed in the 
context of simulation results and empirical HIV data in 
the following section. 

III. RESULTS AND DISCUSSION 

We have run simulations as described in section II for 
various sets of parameters qualitatively leading to the 
same results for the time course of pv and p,s, as long as 
the system is initialized near but below the percolation 
threshold. For the following discussion let us choose the 
parameter settings Do = 0.5, po = 0.325, <7„ = qis — 0.95, 
n = 15, A = 2. The virgin system is infected within 
a ball that includes one and two bit mutants leading 

pM = 2-i5(l + (i^^) + (i2'^))(l-i?o-po+CoPo) ~ 0.0012. 
A lower absorbing boundary of p^ is given by 2~^^ as less 
than one viral strain cannot exist. Further evaluation of 
the simulations yields estimates of p^-, = 0.002 where the 
virus begins to percolate. Taking this together, we will 
be able to analyze the simulation results from the point 
of view of first passage time distributions (cp. appendix). 
We have run simulations for various choices of p mimic- 
ing viruses with different aggressiveness towards the im- 
mune system. For p as large as 0.005 we hardly see any 
time period of struggle between the immune system and 
the virus leading to an immediate exponential growth of 
Pv. The system shows very short incubation periods and 
vanishing probability of viral defeat. The distribution of 
incubation periods can then be approximated by a sim- 
ple inverse Gaussian distribution. Decreasing p leads to 
longer incubation periods that correspond to periods of 
combat between virus and immune system as observed 
in figure 1. 

For further discussions we will focus on simulations with 
p = 0.0001 as they show a distribution of incubation pe- 
riods that are in best accordance with real data on HIV 
incubation periods. Nonetheless the theoretical frame- 
work as developed in the appendix will be equally appli- 



cable for arbitrary choice of p. 

Figure 3 offers a comparison of a survival function gen- 
erated by our cellular automaton model with the respec- 
tive data describing the probability that a HIV positive 
patient has not yet developed AIDS at time t after sero- 
conversion. The HIV data are taken from a seroconverter 
study undertaken at the Robert Koch Institut within the 
CASCADE collaboration [34]. 
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FIG. 3. Compaxison of the probability for HIV positives 
not yet to have developed AIDS with a survival distribution 

generated by our simulations (after adequate renormalization 
of the time axis, Do = 0.5, po = 0.325, g„ = qts = 0.95, 
n = 15, A = 2, p„(0) = 0.0012, = 0.002). 

Figure 3 shows that the model reproduces main charac- 
teristics of the real system. The numerical simulations re- 
sult in a survival function that is similar to that observed 
from HIV patients. In particular, they predict the occur- 
rence of long-term survivors as observed in reality and 
link it to a dynamical percolation mechanism. We would 
like to emphasize that in this framework a quantitative 
comparison of our model parameters with experimental 
data is not very meaningful. However, any parameter 
setting that corresponds to a system that is initially be- 
low the percolation threshold and that is attacked with 
moderate aggressiveness (moderate values of p) will show 
the same qualitative behavior. This demonstrates the 
robustness of our model and ensures its applicability to 
even larger sequence spaces than those simulated here. 
Furthermore let us analyze the data in the light of the 
first passage time distributions derived in the appendix. 
We have to specify the functional from of the viral growth 
rate r{t). Different from the case of a very aggressive 
virus (large p), a constant growth rate r{t) = p, > 
does not fit the simulation results for viruses that are 
only moderately destructive (small p). Therefore let us 
approximate r{t) underlying the simulations by an ex- 
pansion in powers of t as 

r{t) = /i + -ft. 

Such a simple approach may not exactly reproduce the 
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waiting time distribution but can show the origin of its 
characteristics. This is exemphfied by figure 4 where the 
incubation period distributions corresponding to the sur- 
vival curves shown in figure 3 arc approximated by a first 
passage time distribution with /j. = 0.064, 7 = —0.0092 
and cr^ = 0.0091. This corresponds to the picture that 
the viral species initially is able to establish new strains 
but that its opportunities for spreading in sequence space 
are successively diminished. In many cases the virus nev- 
ertheless is able to percolate sequence space if its sup- 
pression takes effect too slowly. This happens in a non- 
deterministic manner due to stochastic fluctuations cor- 
responding to CT^ > and generates the observed incu- 
bation period distribution. 
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FIG. 4. Comparison of the incubation period distributions 
(IPD) corresponding to figure 3 with the theoretical model 
with r(t) = 0.064 - 0.0092f, = 0.0091. 



tion [35] . This is well in accordance with our model which 
for p = predicts that no percolation will occur. More 
recently it has been shown that also in individuals with 
heterozygous genotypes a slower progression to AIDS can 
be observed. Moreover those patients have a 70% re- 
duced risk to maintain the HIV infection and develop 
AIDS [36]. Therefore, already a reduction of CCR5 re- 
ceptors on CD4+ cells, making viral fusion more difficult, 
improves the chance for prolonged or even total survival. 
This fits well with the predictions of the model for de- 
crease in p. 

Recent progress in vaccine research [37-39] further sup- 
ports the model. From the model's point of view, vacci- 
nation corresponds to a local raise of immune receptors' 
density po- This drives the system far below the percola- 
tion threshold and accordingly HIV will hardly manage 
to spread in sequence space. 

In conclusion the above HIV/AIDS phenomenology can 
be interpreted within our cellular automaton model. 
Prolonged survival as well as a finite fraction of non- 
progressors can be traced back to the enhanced stability 
below the percolation transition in this framework. Con- 
sequently, from the percolation model's point of view, 
vaccination and receptor blocking are encouraged as effi- 
cient strategies to overcome an HIV infection. 

C.K. would like to thank the Stiftung der Deutschen 
Wirtschaft for financial support. 



APPENDIX: 



Limitations of the linear approximation become obvi- 
ous with increasing t. r{t) is unbounded for negative 
numbers leading to arbitrarily large destruction of viral 
strains with time. As a consequence, the corresponding 
incubation period distribution shows an unrealistic cut- 
off for large t. This disappears when considering further 
terms in the expansion of r(t) expanding the regime of 
applicability of the theoretical model. 
Describing the behavior of incubation periods within our 
model we can summarize that one observes an increase 
in waiting times before percolation and an enlarged frac- 
tion of cases where viral strains get totally extinct with 
decreasing p, i.e. less aggressive viral strains. This finds 
clear correspondence in real HIV statistics, p is a mea- 
sure for the vulnerability of the immune system under 
the attack of HIV. This virus manages its destructive 
penetration into T helper cells (CZ)4+ cells) not only by 
membrane fusion mediated by CDA but generally needs 
an additional co-receptor which is referred to as CCR5. 
As almost all HIV strains rely on this mechanism for 
replication in T cells, individuals who show a homozy- 
gous mutation leading to a non-expression of the CCR5 
receptor have proven to be resistant against HIV infec- 



First passage time distributions for geometric 
Brownian motion between two absorbing boundstries 

Facing the stochastic nature of pv{t) we choose an 
ansatz in the regime before the percolation transition 
that expects a time dependent viral growth rate r(t) of 
Pv{t) which is superposed by noise. In terms of a stochas- 
tic differential equation this can be written as 



dp^t) = r{t)py{t)dt + p^{t)dBt{0, CT^ 



(9) 



with St (0,0-^) denoting Brownian motion with mean 
and variance a^t. Within the Stratonovich interpretation 
[40] this equation leads to 



Pv{t) = p„(0)e«(*)+^'(O''^') 

R{t)= [ r{t')dt'. 
Jo 



(10) 
(11) 



Accordingly py is described by geometric Brownian mo- 
tion that is locked between two absorbing boundaries at 
2~" (less than one strain cannot exist) and an upper criti- 
cal concentration that leads to percolation of the virus. 
This can be translated to Brownian motion Bt{R{t),a^) 
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(mean R(t) and variance a^t) with Bq = and limited 

by 



-a = In 



6 = In 



PviO) 

Pv 
Pv{0) 



< 



> 0. 



The probabihty density j3(a;, t) describing the distribution 
of the stochastic variable Bt{R{t),a'^) is determined by 
the following Fokker-Planck equation [41,42] 



dp{x,t) , ,d , , (7^ 92 



dt 



d 



J{x,t) 



(12) 
(13) 

(14) 



J{—a,t) and J{h,t) represent the contributions of the 
probability flow being absorbed at the boundaries — a < 
and & > at time t. In other words J{b,t)dt is the 
probability that Bt{R{t),a^) reaches b for the first time 
in [t,t + dt[ under the additional condition that it has 
not yet met the absorbing boundary at — o. However, 
this means that J{b, t) is equivalent to the first passage 
time distribution of the process py (t) with respect to the 
upper boundary , again requiring that it has not passed 
the lower absorbing boundary at 2~". Note that J{b,t) 
represents a defective probability distribution in t as the 
upper boundary is not reached with probability 1. 
Accordingly it remains to solve (12) with respect to the 
following initial and boundary conditions: 



p{x,0)=6{x) Vx G 
p{-a,t)=0 \/t 

p{b,t) = o yt. 



-a,b] 



Having the reflection principle in mind one can derive a 
solution under this conditions as an adequate superposi- 
tion of Gaussian distributions [41-43]. From this one can 
easily deduce using (14) 



J{b,t) = 



F{a, b, aH) (i.-a(t))^ 



F{a,b, aH) = - 



2b(a + b) 

e o^* 



(-a(l 



2fc(a+!)) 



)+b{e 



2aia+b) 



(15) 



2(a + b)2 

e~^^ - 1 



Obviously, in case of only one absorbing boundary (and 
r{t) = fi, R{t) = fit) we get the inverse Gaussian dis- 
tribution as a well known solution for this special prob- 
lem [44]. A parameter setting of Dq = 0.5, po = 0.325, 
= = 0.95, n = 15, A = 2, p,(0) = 2-i5(l + (is) + 
( 2^))(1 — Dq — P0 + -D0P0) ~ 0.0012 as discussed in section 
HI leads to a = 3.7 and b = 0.51. 
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